Development of a prediction model for predicting the prevalence of nonalcoholic fatty liver disease in Chinese nurses: the first-year follow data of a web-based ambispective cohort study

Background Nonalcoholic fatty liver disease (NAFLD) is gradually becoming a huge threat to public health. Given their complex working characteristics, female nurses have been found to be at high risk of NAFLD. This study aimed to develop and validate a prediction model to predict the prevalence of NAFLD in female nurses based on demographic characteristics, work situation, daily lifestyle and laboratory tests.
Methods This study was a part of the Chinese Nurse Cohort Study (The National Nurse Health Study, NNHS). Data were extracted from the first-year follow-up data collected from 1 June to 1 September 2021 through questionnaires and physical examination records in a comprehensive tertiary hospital. The questionnaires covered demographic characteristics, work situation and daily lifestyle. Logistic regression and a nomogram were used to develop and validate the prediction model.
Results A total of 824 female nurses were included in this study. Living situation, smoking history, monthly night shift, daily sleep time, ALT/AST, FBG, TG, HDL-C, UA, BMI, TBil and Ca were independently associated with NAFLD occurrence. A prediction model for predicting the prevalence of NAFLD among female nurses was developed and verified in this study.
Conclusion Living situation, smoking history, monthly night shift, daily sleep time, ALT/AST, FBG, TG, UA, BMI and Ca were independent predictors, while HDL-C and TBil were independent protective indicators of NAFLD occurrence. The prediction model and nomogram could be applied to predict the prevalence of NAFLD among female nurses, which could be used in health improvement.
Trial registration This study was a part of the Chinese Nurse Cohort Study (The National Nurse Health Study, NNHS), an ambispective cohort study that includes past data and is registered at ClinicalTrials.gov (https://clinicaltrials.gov/ct2/show/NCT04572347) and the China Cohort Consortium (http://chinacohort.bjmu.edu.cn/project/102/).
Supplementary Information The online version contains supplementary material available at 10.1186/s12876-024-03121-1.


Introduction
Nonalcoholic fatty liver disease (NAFLD) is gradually becoming a huge threat to public health [1]. More than 25% of the population around the world has been diagnosed with NAFLD, and the prevalence increases to 32% in the Middle East and 31% in South America [2]. The epidemic of NAFLD is also severe in China. With the fastest-growing patient population, China is projected to have 314.58 million patients with NAFLD by 2030 [3]. In addition to affecting the structure and function of the liver, NAFLD also has important effects on other organs [4,5]. Patients with NAFLD are found to have a high risk of developing type 2 diabetes, cardiovascular and cerebrovascular diseases, chronic kidney diseases, and even death from related diseases [4-7]. Because early NAFLD is preventable and reversible [7], identifying risk factors for NAFLD and providing related interventions are essential.
Previous studies have explored risk factors for NAFLD and found that age, sex, race, metabolic syndrome (MS) and unhealthy lifestyle, such as unbalanced diet, sedentary behaviour, low-level physical activity and lack of sleep, were closely associated with the prevalence of NAFLD [8]. Furthermore, several prediction models of NAFLD occurrence have been developed, which found that age, ethnicity, sex, exercise, smoking, heart rate, blood pressure, body mass index (BMI), waist circumference, high-density lipoprotein-cholesterol (HDL-C) and bilirubin could independently predict the prevalence of NAFLD [9,10]. However, most current models focus only on laboratory tests and neglect that lifestyle could also influence the prevalence of NAFLD.
As a group with high-intensity and night-shift jobs, nurses have been found to have a high prevalence of NAFLD, especially nurses working in emergency departments, among whom the prevalence of NAFLD could reach 28.3% [11]. Compared to the general population, nurses experience greater work pressure and quite different lifestyles, such as frequent night shifts and extra meals at night, which are high-risk factors for NAFLD [8]. Meanwhile, nurses also engage in more frequent physical activity owing to constant patient care, which is a protective factor against NAFLD [8]. Moreover, females make up the vast majority of nurses, and 97.7% of nurses in China are female [12]. Given these complex working characteristics, it is necessary to develop a dedicated prediction model to predict the prevalence of NAFLD in female nurses.
Therefore, the purposes of this study were to identify risk factors for NAFLD from demographic characteristics, work situation, daily lifestyle and physical examination records, and to develop a prediction model for the prevalence of NAFLD in female nurses that can guide them in preventing and treating NAFLD accurately.

Study design
This study was a part of the Chinese Nurse Cohort Study (The National Nurse Health Study, NNHS), and registration information for this cohort was included in the protocol for the study [13]. The NNHS has been approved by the Medical Research Ethics Committee of Peking University Third Hospital (IRB00006761-M2020306). The Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement was applied to standardize the study procedure [14].

Participants
Registered and licensed female nurses practising in a comprehensive tertiary hospital were recruited, and data were collected from 1 June to 1 September 2021 as the first-year follow-up data of the NNHS. Nurses who were unwilling to participate in research or had missing data were excluded. Informed consent was obtained from all the participants.

Main outcomes
The prevalence of NAFLD was the main outcome of our study. According to the Guidelines for prevention and treatment of nonalcoholic fatty liver disease in China [4], the diagnostic criteria of NAFLD in this study were: (1) no history of drinking alcohol, or an alcohol equivalent amount of less than 70 g/week; (2) exclusion of diseases that can lead to fatty liver, such as viral hepatitis, drug-induced liver disease and autoimmune liver disease; and (3) imaging evidence of diffuse hepatic steatosis. In this study, the results of abdominal B-ultrasound were used as imaging evidence.

Trial registration
The measurements of these indicators are presented in Table S1.

Statistical analysis
IBM SPSS Statistics Version 23.0 (SPSS, Chicago, USA) and R software Version 4.2.1 (R Foundation, Vienna, Austria) with the packages "rms", "pROC", "rmda", "nricens" and "ggplot2" were used for data description and statistical analysis. The Q test was used for outlier testing. Categorical variables were described as percentages (%), and continuous variables were expressed as mean and standard deviation or median (quartiles). Categorical variables were analysed with the chi-square test, and continuous variables were analysed with independent sample t tests or ANOVA. According to the results of univariate analysis in the development set and validation set, factors significantly associated with NAFLD occurrence (p < 0.10) were included in binary logistic regression analysis in the development set. Factors that were significantly associated with NAFLD occurrence in binary logistic regression analysis (p < 0.05) were included in the prediction model. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated.
A nomogram was drawn based on the results of the binary logistic regression analysis, and the prediction model was developed. The discrimination ability of the model was evaluated by Harrell's concordance index (C-index) and the receiver operating characteristic (ROC) curve. The calibration of the prediction model was evaluated by calibration curves. The clinical utility of the prediction model was evaluated by decision curve analysis (DCA). The C-index, AUROC and calibration curves were analysed using 1000 bootstrap resamples.
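For readers unfamiliar with how ORs and their CIs follow from a logistic regression, the short Python sketch below shows the usual Wald calculation, OR = exp(beta) with limits exp(beta ± 1.96 × SE). The function name, coefficient and standard error are hypothetical and chosen only for illustration; they are not values from this study, and the statistical software used here may compute intervals differently.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for one logistic regression coefficient.

    Uses the Wald interval exp(beta +/- z * se); z = 1.96 gives a 95% CI.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, for illustration only.
or_, lower, upper = odds_ratio_ci(beta=0.5, se=0.2)
print(f"OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An OR above 1 with a CI that excludes 1 corresponds to a risk factor, and an OR below 1 with a CI that excludes 1 corresponds to a protective factor.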
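The bootstrap evaluation of discrimination can be illustrated with a minimal Python sketch. It assumes a simple percentile bootstrap of the AUROC (which equals the C-index for a binary outcome); the text does not state which bootstrap variant was used, so this is one plausible reading and not the study's exact procedure. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def auc(y_true, y_score):
    """AUROC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def bootstrap_auc_ci(y_true, y_score, n_boot=1000, seed=0):
    """Point estimate and percentile 95% CI from n_boot bootstrap resamples."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)  # sample participants with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue  # skip resamples that contain only one outcome class
        stats.append(auc(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(stats, [2.5, 97.5])
    return float(auc(y_true, y_score)), float(lo), float(hi)

# Synthetic outcome and score for illustration only (not the NNHS data).
rng = np.random.default_rng(42)
y = rng.binomial(1, 0.16, size=500)
score = y + rng.normal(0.0, 1.0, size=500)
point, lo, hi = bootstrap_auc_ci(y, score)
print(f"AUROC = {point:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```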

Participants
A total of 824 female nurses were included in our study, of whom 569 were assigned to the development set and 255 to the validation set. The flow chart is shown in Fig. 1. The mean age of all participants was 32.62 ± 7.07 years, the median length of service was 11 [7, 17] years, and the prevalence of NAFLD was 15.5% (128/824). In the development set, the mean age was 32.66 ± 6.84 years, and the prevalence of NAFLD was 15.8% (90/569). In the validation set, the mean age was 32.54 ± 7.56 years, and the prevalence of NAFLD was 14.9% (38/255). There was no significant difference in characteristics between the development set and validation set (Table S2). The characteristics of participants with and without NAFLD are presented in Table 1.
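As a quick arithmetic check, the reported prevalences can be recomputed from the case counts, assuming the set sizes of 569 and 255 given at the start of this paragraph (which sum to 824). The helper name is hypothetical.

```python
def prevalence_pct(cases, n):
    """Prevalence as a percentage, rounded to one decimal place."""
    return round(100 * cases / n, 1)

# Case counts and set sizes as reported in the paragraph above.
counts = {"total": (128, 824), "development": (90, 569), "validation": (38, 255)}
for name, (cases, n) in counts.items():
    print(f"{name}: {cases}/{n} = {prevalence_pct(cases, n)}%")
```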

Model development
Based on the results of univariate analysis, 26 indicators were associated with NAFLD in the development or validation set (p < 0.10), of which 4 were demographic characteristics, 5 were work situation, 2 were daily lifestyle and 15 were laboratory tests (Table 1). The detailed parameters of the characteristics in univariate analysis are presented in Table 1.
As a result of binary logistic regression analysis, 10 indicators were recognized as risk factors for NAFLD development (p < 0.05) (Table 2). The results of binary logistic regression analysis showed that living situation, smoking history, monthly night shift, daily sleep time, ALT/AST, FBG, TG, HDL-C, UA, BMI, TBil and Ca were independently associated with NAFLD prevalence. Then, the NAFLD risk nomogram was built based on the independent predictors described above (Fig. 2).
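To make the link between a logistic regression model and a nomogram concrete, the Python sketch below shows the usual construction: each predictor's contribution is rescaled so that the predictor with the widest effect range spans 0 to 100 points, and the total points map back to a predicted probability through the logistic function. The intercept, coefficients, value ranges and the example nurse are all hypothetical; they are not the fitted values in Table 2, and only positive coefficients are handled to keep the sketch short.

```python
import math

# Hypothetical intercept, coefficients and value ranges, for illustration only.
# These are NOT the fitted values reported in Table 2.
INTERCEPT = -12.0
PREDICTORS = {
    # name: (coefficient, minimum value, maximum value)
    "BMI": (0.30, 16.0, 36.0),
    "TG": (0.80, 0.3, 5.0),
    "night_shifts_per_month": (0.10, 0.0, 12.0),
}

# The predictor with the widest effect range spans the full 0-100 point axis.
SCALE = 100.0 / max(b * (hi - lo) for b, lo, hi in PREDICTORS.values())

def total_points(values):
    """Sum of nomogram points; assumes all coefficients are positive."""
    return sum(b * (values[k] - lo) * SCALE for k, (b, lo, hi) in PREDICTORS.items())

def risk_from_points(points):
    """Map total points back to the linear predictor, then to a probability."""
    baseline = INTERCEPT + sum(b * lo for b, lo, hi in PREDICTORS.values())
    lp = baseline + points / SCALE
    return 1.0 / (1.0 + math.exp(-lp))

def risk_direct(values):
    """Predicted probability computed directly from the logistic model."""
    lp = INTERCEPT + sum(b * values[k] for k, (b, lo, hi) in PREDICTORS.items())
    return 1.0 / (1.0 + math.exp(-lp))

nurse = {"BMI": 26.0, "TG": 1.7, "night_shifts_per_month": 6.0}
pts = total_points(nurse)
print(f"total points = {pts:.1f}, predicted risk = {risk_from_points(pts):.3f}")
```

Reading the risk from the points axis gives the same probability as evaluating the model directly, which is why a nomogram can serve as a paper-based calculator for the fitted model.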

Model validation
The C-indexes of the nomogram in the development set and validation set were 0.97 and 0.93, respectively, which indicated that the nomogram had good discrimination and prediction abilities. The AUROC of the nomogram in the development set was 0.97 (95% CI, 0.96-0.99), while the sensitivity was 0.93 and the specificity was 0.93. The AUROC of the nomogram in the validation set was 0.98 (95% CI, 0.97-0.98), while the sensitivity and specificity were 0.97 and 0.90, respectively (Figure S1). Therefore, the nomogram performed well.
The calibration curves of the development set and validation set are presented in Figure S2 and indicated that the nomogram had good agreement between the predicted probabilities and the actual observed probabilities. DCA showed that the application of the nomogram in female nurses to predict the risk of NAFLD was more effective than the intervention-for-all-patients scheme (Figure S3).
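Sensitivity and specificity depend on the probability cut-off applied to the model output. The Python sketch below shows the calculation at a single cut-off; the cut-off of 0.5, the function name and the toy data are assumptions for illustration, since the text does not state how the operating point was chosen.

```python
def sensitivity_specificity(y_true, y_prob, threshold):
    """Sensitivity and specificity when predicting NAFLD for prob >= threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1:
            tp += int(predicted_positive)
            fn += int(not predicted_positive)
        else:
            fp += int(predicted_positive)
            tn += int(not predicted_positive)
    return tp / (tp + fn), tn / (tn + fp)

# Toy outcomes and predicted probabilities, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]
sens, spec = sensitivity_specificity(y_true, y_prob, threshold=0.5)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```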
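The comparison made in DCA rests on net benefit, which at a threshold probability pt is TP/n - (FP/n) × pt/(1 - pt). The Python sketch below computes it for a model and for the intervention-for-all strategy. The function names and toy data are hypothetical, and the sketch illustrates the standard definition without reproducing the analysis behind Figure S3.

```python
def net_benefit_model(y_true, y_prob, pt):
    """Net benefit of intervening on everyone with predicted risk >= pt."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    return tp / n - fp / n * pt / (1 - pt)

def net_benefit_treat_all(y_true, pt):
    """Net benefit of intervening on every participant regardless of risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * pt / (1 - pt)

# Toy outcomes and predicted probabilities, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]
nb_model = net_benefit_model(y_true, y_prob, pt=0.5)
nb_all = net_benefit_treat_all(y_true, pt=0.5)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}")
```

A decision curve plots these quantities over a range of pt values; the model is considered clinically useful where its curve lies above both the treat-all curve and the zero line of treating no one.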

Discussion
To our knowledge, this may be the first study to investigate the prevalence and influencing factors of NAFLD and to develop a prediction model in female nurses. Based on the baseline NNHS data, we developed and validated a prediction model for predicting the prevalence of NAFLD in female nurses. Meanwhile, we developed an intuitive nomogram to visualize the predictive model for clinical use. The model displayed excellent discrimination and clinical value. Predictors in the model included BMI, FBG, TG, HDL, AST, ALT, TBil, UA and monthly night shift, which comprised some risk factors consistent with previous studies of NAFLD [9,10] and one factor reflecting the professional characteristics of nurses.
In this study, the prevalence of NAFLD was 15.5%, which was slightly lower than in previous studies conducted in healthy participants or patients with T2DM [15]. Although previous studies found that the prevalence of NAFLD was higher among nurses [11], we failed to find similar results. This may be the result of different department distributions, as only 7.65% of the female nurses included in this study were working in emergency and critical care units. Furthermore, only females were recruited in this study, but males have been found to have a higher risk of NAFLD (OR 1.779, 95% CI 1.676-1.888) [9]. The participants included in this study were also much younger than those in previous studies, and older age is an independent predictor of NAFLD occurrence [9,16]. These may also be the reasons for the results.
Many previous studies have proven that obesity is an important risk factor for NAFLD [17,18]. A study in China showed that the risk of NAFLD increases with BMI, even in nonobese individuals [10]. Similar conclusions also appeared in our research. Meanwhile, T2DM has also been found to be a predictor of NAFLD [17]. In our study, FBG was still the strongest predictor of the prevalence of NAFLD, regardless of whether female nurses had diabetes.
Increased TG and decreased HDL-C concentrations commonly appear in NAFLD [19], as they did in our study. TG is the predominant form of fat accumulated in the liver, and increased TG has been found to be strongly associated with NAFLD [20]. HDL is a substance that transports triglycerides in the liver, so a high concentration of HDL is a protective factor for NAFLD, while previous studies also showed that TG/HDL-C may be a good predictor of NAFLD [21].
AST and ALT can be increased without accompanying symptoms. ALT is most closely related to liver fat accumulation, even within the normal reference range [22], and is often used as a surrogate marker for NAFLD in epidemiological studies [23]. Recent studies have shown that the ALT/AST ratio may be more sensitive and specific as a marker of NAFLD than ALT and AST alone, especially for patients with normal ALT and AST [22]. Similarly, we found that a higher ALT/AST ratio was related to an increased prevalence of NAFLD, which was consistent with previous studies.
Furthermore, we also found that bilirubin was a protective factor against the occurrence of NAFLD. As the end product of heme metabolism, bilirubin has beneficial properties, such as antioxidant and anti-inflammatory effects, that have attracted increasing attention [24]. Meanwhile, oxidative stress and the inflammatory response have been proven to be important contributors to the pathogenesis of NAFLD [4]. Therefore, higher bilirubin may be a protective factor against NAFLD via antioxidant and anti-inflammatory effects. Meanwhile, higher bilirubin was associated with a lower incidence of abdominal obesity and metabolic syndrome, while abdominal obesity and metabolic syndrome were risk factors for NAFLD [25,26]. These may be the reasons that bilirubin was a protective factor against the occurrence of NAFLD.
In addition, we also found that the increase in serum calcium was related to the occurrence of NAFLD, which might be a dose-response effect. A study of nonalcoholic fatty liver disease in South Korea reached a similar conclusion [27]. Meanwhile, previous studies have also confirmed that serum calcium has a significant correlation with insulin resistance, abnormal glucose metabolism and abnormal lipid metabolism [28]. However, the conclusions of current studies on the relationship between serum calcium and NAFLD are not consistent, and more studies are needed to verify this hypothesis.
The most important finding of our study was that lifestyle factors, such as living situation, smoking history and sleep time, could also influence the prevalence of NAFLD. Previous studies found that higher ultra-processed food consumption was associated with eating with family members, and eating with friends was associated with lower ultra-processed food consumption [29]. That may be the reason that living with a spouse and parents was associated with a higher risk of NAFLD. As for smoking history, we found that participants who smoked had much higher risks of NAFLD than those with no smoking or passive smoking, which was consistent with previous studies [30,31]. Although no studies have examined the relation between sleep time and NAFLD, trouble sleeping was positively associated with NAFLD [32]. Meanwhile, 7-8 h of sleep was also the nadir for associations with all-cause, cardiovascular disease and other-cause mortality [33].
Meanwhile, we also found that night shift work was associated with the occurrence of NAFLD in female nurses. Previous studies have shown that exposure to light at night may lead to insufficient melatonin secretion and disorders of liver metabolism [34], while irregular shift work has been found to be associated with pathological liver fat accumulation [35]. Meanwhile, prolonged night work has been found to increase nurses' risk of dyslipidemia and abnormal liver function [36], and circadian misalignment may have an underlying pathogenic role. This may be the reason why more frequent night shifts were associated with the prevalence of NAFLD.
Although not included in the model, several indicators of work situation, such as human resources, work department and service years, were also found to be associated with the prevalence of NAFLD in univariate analysis. Meanwhile, exercise habits and weekly family time could also influence the prevalence of NAFLD in univariate analysis. Therefore, it is important to develop a more appropriate shift system to control the monthly shift number and increase family and sleep time, which are challenges for nurse managers and policy makers.

Strengths and limitations
This study is the first to focus on the predictors of nonalcoholic fatty liver disease in female nurses. Meanwhile, our study is also the first to add nurses' work characteristics, daily lifestyle and social psychological indicators to the prediction model of NAFLD occurrence, which could provide recommendations for the prevention and treatment of NAFLD among nurses and other groups with similar work characteristics. This study also has several limitations. First, the data of this study were obtained from baseline data of the NNHS that were collected at one hospital, which may lead to limited representation. In the future, the follow-up data of the NNHS could be used to verify the prediction model, and multicenter studies are also needed. In addition, NAFLD was diagnosed using ultrasonography in this study, which was greatly dependent on the proficiency of doctors. Although the doctors included in this study were experienced, there still existed a possibility of diagnostic bias.

Fig. 1 Flow chart of this study

Conclusion
A prediction model for predicting the prevalence of NAFLD among female nurses was developed and verified in this study. Living situation, smoking history, monthly night shift, daily sleep time, ALT/AST, FBG, TG, UA, BMI and Ca were independent predictors, while HDL-C and TBil were independent protective indicators of NAFLD occurrence. The model displayed excellent discrimination and clinical value, which has clinical significance for identifying the risk of NAFLD and can provide some suggestions for the prevention and treatment of NAFLD.

Fig. 2 Developed nomogram for predicting the prevalence of NAFLD among nurses

This study was a part of the Chinese Nurse Cohort Study (The National Nurse Health Study, NNHS), an ambispective cohort study that includes past data and is registered at ClinicalTrials.gov (https://clinicaltrials.gov/ct2/show/NCT04572347) and the China Cohort Consortium (http://chinacohort.bjmu.edu.cn/project/102/).

Table 1
Characteristics of NAFLD and non-NAFLD patients in the development set and validation set

Table 2
The results of binary logistic regression analysis